{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# ResNets\n",
    "\n",
    "## 什么是深度残差网络？\n",
    "\n",
    "原始论文 https://arxiv.org/pdf/1512.03385.pdf \n",
    "\n",
    "\n",
    "\n",
    "当我们使用深层网络的时候，它们能够模拟很深层次的复杂函数，同时也可以从不同的抽象层次上提取特征。这一点会比浅层网络强很多。\n",
    "\n",
    "但是天下没有免费的午餐，深层网络有优点自然也有缺点，那就是——梯度消失(在做反向传播的时候，因为每一步都要乘以一个权重矩阵，这样就容易导致传播到第一层的时候，梯度接近于0)\n",
    "\n",
    "但是在Resnet中 \"shortcut\" 的存在允许梯度直接反向传播到前一层，这样就可以帮我们缓解梯度消失的现象。\n",
    "\n",
    "核心的 idea 请参考 课程CS231n 或者 原始论文，这里不做赘述。\n",
    "\n",
    "![ResNetsBlock](./images/shortcut.png)\n",
    "\n",
    "\n",
    "\n",
    "ResNet是由很多小的block组成,每个block的组成如下图所示：\n",
    "\n",
    "![ResNetsBlock](./images/block.png)\n",
    "\n",
    "\n",
    "\n",
    "ResNet34的结构图如下所示：\n",
    "\n",
    "![ResNets](./images/resNets.jpg)\n",
    "\n",
    "\n",
    "\n",
    "所以，我们只需要在实现block上，做一些小技巧，就可以了。\n",
    "\n",
    "\n",
    "上文中的图，[参考 DeepLearning.ai 的课程4](https://www.coursera.org/learn/convolutional-neural-networks/home/welcome)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import torch\n",
    "from torch import nn\n",
    "from torch.nn import functional as F\n",
    "from torch.autograd import Variable\n",
    "# 我们这里以 ResNets34 为例子\n",
    "\n",
    "# 先实现一个Block\n",
    "class Block(nn.Module):\n",
    "    def __init__(self, in_channel, out_channel, strides=1, same_shape=True):\n",
    "        super(Block, self).__init__()\n",
    "        self.same_shape = same_shape\n",
    "        if not same_shape:\n",
    "            strides = 2\n",
    "        self.strides = strides\n",
    "        self.block = nn.Sequential(\n",
    "            nn.Conv2d(in_channel, out_channel, kernel_size=3, stride=strides, padding=1, bias=False),\n",
    "            nn.BatchNorm2d(out_channel),\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.Conv2d(out_channel, out_channel, kernel_size=3, padding=1, bias=False),\n",
    "            nn.BatchNorm2d(out_channel)\n",
    "        )\n",
    "        if not same_shape:\n",
    "            self.conv3 = nn.Conv2d(in_channel, out_channel, kernel_size=1, stride=strides, bias=False)\n",
    "            self.bn3 = nn.BatchNorm2d(out_channel)\n",
    "    def forward(self, x):\n",
    "        out = self.block(x)\n",
    "        if not self.same_shape:\n",
    "            x = self.bn3(self.conv3(x))\n",
    "        return F.relu(out + x)\n",
    "\n",
    "# 开始实现 ResNets34\n",
    "class ResNet34(nn.Module):\n",
    "    def __init__(self, num_classes=10):\n",
    "        super(ResNet34, self).__init__()\n",
    "        # 最开始的几层\n",
    "        self.pre = nn.Sequential(\n",
    "                nn.Conv2d(3, 64, 7, 2, 3, bias=False),\n",
    "                nn.BatchNorm2d(64),\n",
    "                nn.ReLU(inplace=True),\n",
    "                nn.MaxPool2d(3, 2, 1))\n",
    "        # 从论文的图中，可以看到，我们有3，4，6，3个block\n",
    "        self.layer1 = self._make_layer(64, 64, 3)\n",
    "        self.layer2 = self._make_layer(64, 128, 4, stride=2)\n",
    "        self.layer3 = self._make_layer(128, 256, 6, stride=2)\n",
    "        self.layer4 = self._make_layer(256, 512, 3, stride=2)\n",
    "\n",
    "        # 分类用的全连接\n",
    "        self.fc = nn.Linear(512, num_classes)\n",
    "    \n",
    "    def _make_layer(self,  in_channel, out_channel, block_num, stride=1):\n",
    "        layers = []\n",
    "        if stride != 1:\n",
    "            layers.append(Block(in_channel, out_channel, stride, same_shape=False))\n",
    "        else:\n",
    "            layers.append(Block(in_channel, out_channel, stride))\n",
    "        \n",
    "        for i in range(1, block_num):\n",
    "            layers.append(Block(out_channel, out_channel))\n",
    "        return nn.Sequential(*layers)\n",
    "    \n",
    "    # 在jupyter notebook中，可以尝试输出每一层的size，来查看每一层的输入、输出是否正确。\n",
    "    def forward(self, x):\n",
    "        x = self.pre(x)\n",
    "        print(\"pre层的size是：\", x.size())\n",
    "        x = self.layer1(x)\n",
    "        print(\"layer1的size是：\", x.size())\n",
    "        x = self.layer2(x)\n",
    "        print(\"layer2的size是：\", x.size())\n",
    "        x = self.layer3(x)\n",
    "        print(\"layer3的size是：\", x.size())\n",
    "        x = self.layer4(x)\n",
    "        print(\"layer4的size是：\", x.size())\n",
    "        x = F.avg_pool2d(x, 7)\n",
    "        x = x.view(x.size(0), -1)\n",
    "        print(\"最后的结果是：\", x.size())\n",
    "        return self.fc(x)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "pre层的size是： torch.Size([1, 64, 56, 56])\n",
      "layer1的size是： torch.Size([1, 64, 56, 56])\n",
      "layer2的size是： torch.Size([1, 128, 28, 28])\n",
      "layer3的size是： torch.Size([1, 256, 14, 14])\n",
      "layer4的size是： torch.Size([1, 512, 7, 7])\n",
      "最后的结果是： torch.Size([1, 512])\n",
      "torch.Size([1, 10])\n"
     ]
    }
   ],
   "source": [
    "resnet = ResNet34()\n",
    "x = Variable(torch.randn(1, 3, 224, 224))\n",
    "print(resnet(x).size())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": false
   },
   "source": [
    "上面实现的是较为简单的 ResNets34， 还有更加复杂的 ResNets50 ResNets101 ResNets152 它们的实现需要用到另外一种Block，称为 Bottleneck，Bottleneck的实现如下，然后可以自行尝试实现 ResNets50 来检验。\n",
    "\n",
    "论文中ResNets的表格：\n",
    "\n",
    "\n",
    "![ResNetsBlock](./images/resnets_table.png)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# 先实现一个Block\n",
    "class Bottleneck(nn.Module):\n",
    "    def __init__(self, in_channel, out_channel, strides=1, same_shape=True, bottle=True):\n",
    "        super(Bottleneck, self).__init__()\n",
    "        self.same_shape = same_shape\n",
    "        self.bottle = bottle\n",
    "        if not same_shape:\n",
    "            strides = 2\n",
    "        self.strides = strides\n",
    "        self.block = nn.Sequential(\n",
    "            nn.Conv2d(in_channel, out_channel, kernel_size=1, bias=False),\n",
    "            nn.BatchNorm2d(out_channel),\n",
    "            nn.ReLU(inplace=True),\n",
    "            nn.Conv2d(out_channel, out_channel, kernel_size=3, stride=strides, padding=1, bias=False),\n",
    "            nn.BatchNorm2d(out_channel),\n",
    "            nn.Conv2d(out_channel, out_channel*4, kernel_size=1, bias=False),\n",
    "            nn.BatchNorm2d(out_channel*4)\n",
    "        )\n",
    "        if not same_shape or not bottle:\n",
    "            self.conv4 = nn.Conv2d(in_channel, out_channel*4, kernel_size=1, stride=strides, bias=False)\n",
    "            self.bn4 = nn.BatchNorm2d(out_channel*4)\n",
    "            print(self.conv4)\n",
    "    def forward(self, x):\n",
    "        print(x.size())\n",
    "        out = self.block(x)\n",
    "        print(out.size())\n",
    "        if not self.same_shape or not self.bottle:\n",
    "            x = self.bn4(self.conv4(x))\n",
    "        return F.relu(out + x)"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "display_name": "Python [default]",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}
